feat(index): Add fallback for community report extraction on provider…#2399

Open
Luotianyi-0712-tech wants to merge 1 commit into microsoft:main from Luotianyi-0712-tech:fix/community-report-fallback

@Luotianyi-0712-tech

Problem

When I first configured GraphRAG with the DeepSeek API, indexing failed with "Pipeline error: 'community'". The cause is that GraphRAG's community report generation forces json_schema structured output, which the DeepSeek API does not support.
The create_community_reports workflow fails entirely when using LLM providers that do not support response_format with Pydantic models (i.e., json_schema mode). This includes popular providers like DeepSeek, which only supports {"type": "json_object"} and returns: BadRequestError: This response_format type is unavailable now
When every community report request fails, the community_reports DataFrame is empty, and the pipeline crashes with KeyError: 'community'.

Solution

Implement a three-tier fallback strategy in CommunityReportsExtractor:

  1. Primary: Try structured output with Pydantic response_format (best quality, works on OpenAI/Azure/Anthropic).
  2. Fallback 1: If provider rejects json_schema, catch the error and retry with {"type": "json_object"} (works on DeepSeek, Gemini 1.5, etc.).
  3. Fallback 2: If even json_object is rejected, fall back to plain text completion and extract JSON manually via regex (works on Ollama and other minimal providers).
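
The control flow of the three tiers can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `generate_report`, `CompleteFn`, and the `UnsupportedResponseFormat` exception are hypothetical names, and the provider call is abstracted as a plain callable so the example runs without any LLM client.

```python
import json
import re
from typing import Any, Callable, Dict, Optional, Tuple


class UnsupportedResponseFormat(Exception):
    """Hypothetical stand-in for a provider error rejecting response_format."""


# Hypothetical signature: (prompt, response_format) -> raw completion text.
CompleteFn = Callable[[str, Optional[Any]], str]

JSON_OBJECT = {"type": "json_object"}


def generate_report(complete: CompleteFn, prompt: str, schema: Any) -> Tuple[str, Dict]:
    """Return (tier_name, parsed_report), trying the strictest mode first."""
    # Tiers 1 and 2: the provider is asked to return JSON directly.
    for tier, response_format in (("json_schema", schema), ("json_object", JSON_OBJECT)):
        try:
            return tier, json.loads(complete(prompt, response_format))
        except UnsupportedResponseFormat:
            continue  # provider rejected this mode; move on to the next tier
    # Tier 3: plain text; take the span from the first "{" to the last "}".
    text = complete(prompt, None)
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in plain-text completion")
    return "plain_text", json.loads(match.group(0))


def json_object_only(prompt: str, response_format: Optional[Any]) -> str:
    # Simulates a provider that accepts only {"type": "json_object"}.
    if response_format != JSON_OBJECT:
        raise UnsupportedResponseFormat("This response_format type is unavailable now")
    return '{"title": "Community 0", "findings": []}'


tier, report = generate_report(json_object_only, "Summarize community 0", schema=dict)
print(tier, report["title"])  # json_object Community 0
```

Only the "unsupported format" error triggers a fallback in this sketch; any other exception (timeouts, auth failures, malformed JSON) propagates unchanged, so genuine failures are not masked by the retry chain.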

Provider Support Matrix

Provider | json_schema (Pydantic) | json_object | Plain Text
OpenAI / Azure | | |
Anthropic | | |
Gemini 2.0+ | | |
Gemini 1.5 | ❌ (litellm client validation) | |
DeepSeek | | |
Ollama | partial | partial |

Changes

  • community_reports_extractor.py: Added _is_unsupported_response_format_error(), _parse_json_from_text(), and three-tier try/except logic.
  • No breaking changes. Existing users on OpenAI/Azure continue to use json_schema without any behavior change.
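
For readers who want a feel for what the two helpers might do, here is a hedged sketch. The function names mirror the ones listed above (without the leading underscore), but the bodies are assumptions rather than the PR's actual code: the error check is a simple message-substring heuristic, and the JSON extraction uses a regex only to locate candidate `{` positions, then lets `json.JSONDecoder.raw_decode` find where the object ends.

```python
import json
import re
from typing import Dict

# Assumed phrases; real provider messages vary and this list is illustrative.
_UNSUPPORTED_HINTS = ("unavailable", "not supported", "unsupported")


def is_unsupported_response_format_error(exc: Exception) -> bool:
    """Heuristic: does the error message complain about response_format?"""
    message = str(exc).lower()
    return "response_format" in message and any(
        hint in message for hint in _UNSUPPORTED_HINTS
    )


def parse_json_from_text(text: str) -> Dict:
    """Return the first decodable JSON object embedded in free-form text."""
    decoder = json.JSONDecoder()
    for match in re.finditer(r"\{", text):
        try:
            # raw_decode parses one JSON value starting at the given index
            # and ignores whatever text follows it.
            obj, _end = decoder.raw_decode(text, match.start())
        except json.JSONDecodeError:
            continue  # this "{" did not start valid JSON; try the next one
        if isinstance(obj, dict):
            return obj
    raise ValueError("no JSON object found in model output")


err = Exception("BadRequestError: This response_format type is unavailable now")
print(is_unsupported_response_format_error(err))  # True
print(parse_json_from_text('Sure! {"title": "A", "rating": 7.5} Hope this helps.'))
```

Scanning with `raw_decode` tolerates chatty preambles, trailing text, and stray braces before the real payload, at the cost of returning the first valid object even if a later one was the intended answer.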

Testing

  • Tested with OpenAI (gpt-4o-mini) — uses json_schema path
  • Tested with DeepSeek (deepseek-v4-flash) — falls back to json_object path

@Luotianyi-0712-tech Luotianyi-0712-tech requested a review from a team as a code owner June 13, 2026 03:34
@Luotianyi-0712-tech

Author

@microsoft-github-policy-service agree
